An Analysis on National Maternal Mortality
2025-08-04
Generalized Linear Mixed Models (GLMMs) are a flexible class of statistical models that combine the features of two powerful tools: Generalized Linear Models (GLMs) and Mixed-Effects Models (Agresti 2015).
Can model non-normal outcome variables, such as binary, count, or proportion data
Incorporate random effects, which account for variation due to grouping or clustering in the data, correlated observations, and overdispersion
Handling hierarchical or grouped data (e.g., students within classrooms, patients within clinics) (Lee and Nelder 1996)
Modeling non-normal outcomes, such as:
Binary outcomes (using logistic GLMMs) (Wang et al. 2017)
Count data (using Poisson or negative binomial GLMMs) (Candy 2000)
Proportions or rates (Salinas Ruíz et al. 2023)
Improving inference by accounting for both fixed effects (predictors of interest) and random effects (random variation across groups)
Reducing bias and inflated Type I error rates that can result from ignoring data structure (Thompson et al. 2022)
Frequently used in fields like medicine, ecology, education, and social sciences
One study explores the benefits of a zero-inflated Poisson GLMM (which handles count data with an overabundance of zeroes) applied to maternal mortality data in Ghana (Tawiah, Iddi, and Lotsi 2020)
Another study uses a GLMM to investigate the effect of particulate matter on child and maternal mortality globally
Let
\(\mathbf{y}\) be an \(N \times 1\) column vector of the outcome variable
\(\mathbf{X}\) be an \(N \times p\) matrix of the \(p\) predictor variables
\(\boldsymbol{\beta}\) be a \(p \times 1\) column vector of the fixed-effects coefficients
\(\mathbf{Z}\) be an \(N \times q\) design matrix for the \(q\) random effects
\(\mathbf{u}\) be a \(q \times 1\) vector of random effects, and
\(\boldsymbol{\epsilon}\) be an \(N \times 1\) column vector of the residuals
Then the general equation for the model is given by:
\[\mathbf{y}=\mathbf{X}\boldsymbol{\beta}+\mathbf{Z}\mathbf{u}+\boldsymbol{\epsilon}\]
In a GLMM, the model equation defines a vector of linear predictors in terms of unknown parameters that must be estimated. Each outcome distribution has its own probability function; here we use the Negative Binomial. GLMMs include a link function that relates the response variable \(\mathbf{y}\) to a linear predictor, \(\boldsymbol{\eta}\), which excludes the residuals: \[\boldsymbol{\eta}=\mathbf{X}\boldsymbol{\beta}+\mathbf{Z}\mathbf{u}\]
The link function is \(g(\cdot)\), where \[g(E(\mathbf{y}))=\boldsymbol{\eta}\] and \(E(\mathbf{y})\) is the expectation of \(\mathbf{y}\). The choice of link function depends on the outcome distribution. In this paper, our data follow a Negative Binomial distribution for overdispersed count data, so we use a log link function.
\[g(\cdot)=\log_e(\cdot)\]
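As a small illustration of how the linear predictor and the log link fit together, the sketch below builds \(\boldsymbol{\eta}\) from a fixed-effects part and a random-effects part and then applies the inverse link to get the expected counts. It is written in Python purely for illustration (the analysis itself was run in R), and every number in it is a hypothetical toy value, not an estimate from the maternal mortality data.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Hypothetical toy values, chosen only to show the mechanics.
X = [[1, 0], [1, 1], [1, 0], [1, 1]]  # N x p: intercept plus one binary predictor
beta = [0.5, 1.0]                     # p x 1 fixed-effects coefficients
Z = [[1, 0], [1, 0], [0, 1], [0, 1]]  # N x q: indicator columns for two groups
u = [0.2, -0.2]                       # q x 1 random intercepts, one per group

# Linear predictor: fixed-effects part plus random-effects part.
eta = [xb + zu for xb, zu in zip(matvec(X, beta), matvec(Z, u))]

# Inverse of the log link: expected count on the data scale.
mu = [math.exp(e) for e in eta]

for i, (e, m) in enumerate(zip(eta, mu), start=1):
    print(f"obs {i}: eta = {e:.2f}, mu = {m:.3f}")
```

Observations in the same group share a random intercept, so they shift together on the log scale and are multiplied by the same factor on the count scale.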
In the GLMM, the parameter estimates are obtained by minimizing the negative log-likelihood function (Salinas Ruíz et al. 2023). The means, or least squares means, are derived from the parameter estimates and are computed on the model scale. The inverse of the log link then converts the mean estimates from the model scale to the data scale. Negative Binomial Distribution: \[ f(y;k,\mu)=\frac{\Gamma(y+k)}{\Gamma(k)\,\Gamma(y+1)}\left(\frac{k}{\mu+k}\right)^{k}\left(1-\frac{k}{\mu+k}\right)^{y} \]
The Negative Binomial distribution has two parameters, \(\mu\) and \(k\) (Zuur et al. 2009). The symbol \(\Gamma\) denotes the gamma function, which satisfies \(\Gamma(y+1)=y!\) for non-negative integers \(y\). The mean of the Negative Binomial is \(E(Y)=\mu\) and the variance is \(Var(Y)=\mu+\frac{\mu^2}{k}\), where the second term determines the overdispersion; \(k\) is called the dispersion parameter and indirectly determines the overdispersion. If \(k\) is large relative to \(\mu^2\), the second term is approximately zero and a Poisson distribution may as well be used. The smaller the value of \(k\), however, the larger the overdispersion, and the Negative Binomial with a log link is then the appropriate choice.
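The mean and variance formulas can be checked numerically. The sketch below, in Python for illustration only, evaluates the Negative Binomial probability function in its \((\mu, k)\) form on the log scale and sums over a long range of counts. The values \(\mu=10\), \(k=2\), and \(k=10^6\) are arbitrary assumptions chosen to contrast strong overdispersion with the near-Poisson case.

```python
import math

def nb_pmf(y, mu, k):
    """Negative binomial probability of count y, mean mu, dispersion k."""
    log_p = (
        math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
        + k * math.log(k / (mu + k))
        + y * math.log(mu / (mu + k))  # 1 - k/(mu+k) equals mu/(mu+k)
    )
    return math.exp(log_p)

def nb_moments(mu, k, y_max=2000):
    """Total probability, mean, and variance by direct summation up to y_max."""
    probs = [nb_pmf(y, mu, k) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var

# Small k: variance should be mu + mu^2/k = 10 + 100/2 = 60.
total_small, mean_small, var_small = nb_moments(mu=10.0, k=2.0)

# Very large k: the mu^2/k term nearly vanishes, so variance is close to mu.
total_large, mean_large, var_large = nb_moments(mu=10.0, k=1e6)

print(f"k = 2:    mean = {mean_small:.3f}, variance = {var_small:.3f}")
print(f"k = 1e6:  mean = {mean_large:.3f}, variance = {var_large:.3f}")
```

With the small \(k\) the variance is six times the mean, while with the very large \(k\) the two are almost equal, which is the Poisson limit described above.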
The response variable and the predictors have a linear relationship on the link (log) scale within the levels of the random effects.
The response variable is assumed to follow a negative binomial distribution, with variance greater than the mean (\(\sigma^2>\mu\)).
The residuals and random effects are independent.
The random effects are assumed to be normally distributed, with mean 0 and variance \(\sigma_u^2\).
The Negative Binomial is well suited to overdispersed count data (which we suspect here, as this is population data)
Longitudinal data are not independent, so a GLMM is needed in order to include time as a random effect
The random effect accounts for variation that would not be explained by our fixed effects
Analysis performed with R (R Core Team 2025)
Vital Statistics Rapid Release (VSRR) Provisional Maternal Death Counts and Rates, in the form of a .csv
Published by National Vital Statistics System, a collaboration between the National Center for Health Statistics (NCHS) and state vital record offices
Monthly death counts and death rates by race/ethnicity, age, and overall
Data from January 2019 to December 2024
Data is provisional and updated quarterly; becomes more reliable with more updates
Maternal Deaths between 1 and 9 are suppressed for privacy reasons
“Native Hawaiian or Other Pacific Islander, Non-Hispanic” has 70 NAs for Maternal Mortality, so we omit this subgroup entirely
“American Indian or Alaska Native, Non-Hispanic” has 58 NAs for Maternal Mortality Rate; because we do not use the rate in our model, these NAs do not affect modeling
| Name | Fixed_Effects | Random_Effects | Offset |
|---|---|---|---|
| all_glmmodel_nb | Ethnicity, Age_Group, Dobbs_Era | Year | log(Live_Births) |
| ethnicity_agegroup_glmmodel_nb | Ethnicity, Age_Group | Year | log(Live_Births) |
| allno_glmmodel_nb | Ethnicity, Age_Group, Dobbs_Era | Year | None |
| ethnicity_agegroupno_glmmodel_nb | Ethnicity, Age_Group | Year | None |
Family: nbinom2 ( log )
Formula:
Maternal_Deaths ~ Ethnicity + Age_Group + Dobbs_Era + (1 | Year)
Data: deaths_df3
Offset: log(Live_Births)
AIC BIC logLik -2*log(L) df.resid
4567.5 4614.2 -2272.7 4545.5 507
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
Year (Intercept) 0.03158 0.1777
Number of obs: 518, groups: Year, 6
Dispersion parameter for nbinom2 family (): 148
Conditional model:
Estimate Std. Error
(Intercept) -8.79687 0.07678
EthnicityBlack, Non-Hispanic 1.31563 0.02627
EthnicityWhite, Non-Hispanic 0.28329 0.02604
EthnicityHispanic 0.16204 0.02704
EthnicityAmerican Indian or Alaska Native, Non-Hispanic 1.65241 0.06285
EthnicityUnknown 0.04364 0.02750
Age_Group25-39 years 0.38817 0.01821
Age_Group40 years and over 1.81031 0.02053
Age_GroupUnknown NA NA
Dobbs_EraPost-Dobbs -0.22081 0.02303
z value Pr(>|z|)
(Intercept) -114.57 < 2e-16 ***
EthnicityBlack, Non-Hispanic 50.08 < 2e-16 ***
EthnicityWhite, Non-Hispanic 10.88 < 2e-16 ***
EthnicityHispanic 5.99 2.06e-09 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic 26.29 < 2e-16 ***
EthnicityUnknown 1.59 0.113
Age_Group25-39 years 21.31 < 2e-16 ***
Age_Group40 years and over 88.17 < 2e-16 ***
Age_GroupUnknown NA NA
Dobbs_EraPost-Dobbs -9.59 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Family: nbinom2 ( log )
Formula: Maternal_Deaths ~ Ethnicity + Age_Group + (1 | Year)
Data: deaths_df3
Offset: log(Live_Births)
AIC BIC logLik -2*log(L) df.resid
4647.6 4690.1 -2313.8 4627.6 508
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
Year (Intercept) 0.0326 0.1805
Number of obs: 518, groups: Year, 6
Dispersion parameter for nbinom2 family (): 107
Conditional model:
Estimate Std. Error
(Intercept) -8.88725 0.07754
EthnicityBlack, Non-Hispanic 1.31581 0.02763
EthnicityWhite, Non-Hispanic 0.28254 0.02742
EthnicityHispanic 0.16063 0.02836
EthnicityAmerican Indian or Alaska Native, Non-Hispanic 1.68338 0.06458
EthnicityUnknown 0.04366 0.02881
Age_Group25-39 years 0.38788 0.02012
Age_Group40 years and over 1.80898 0.02225
Age_GroupUnknown NA NA
z value Pr(>|z|)
(Intercept) -114.61 < 2e-16 ***
EthnicityBlack, Non-Hispanic 47.62 < 2e-16 ***
EthnicityWhite, Non-Hispanic 10.31 < 2e-16 ***
EthnicityHispanic 5.66 1.49e-08 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic 26.07 < 2e-16 ***
EthnicityUnknown 1.52 0.13
Age_Group25-39 years 19.28 < 2e-16 ***
Age_Group40 years and over 81.32 < 2e-16 ***
Age_GroupUnknown NA NA
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Family: nbinom2 ( log )
Formula:
Maternal_Deaths ~ Ethnicity + Age_Group + Dobbs_Era + (1 | Year)
Data: deaths_df3
AIC BIC logLik -2*log(L) df.resid
4560.6 4607.3 -2269.3 4538.6 507
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
Year (Intercept) 0.02885 0.1698
Number of obs: 518, groups: Year, 6
Dispersion parameter for nbinom2 family (): 155
Conditional model:
Estimate Std. Error
(Intercept) 3.51596 0.07371
EthnicityBlack, Non-Hispanic 2.16044 0.02612
EthnicityWhite, Non-Hispanic 2.40591 0.02589
EthnicityHispanic 1.56973 0.02690
EthnicityAmerican Indian or Alaska Native, Non-Hispanic -0.48610 0.06264
EthnicityUnknown 1.33561 0.02736
Age_Group25-39 years 1.59868 0.01798
Age_Group40 years and over 0.03846 0.02033
Age_GroupUnknown NA NA
Dobbs_EraPost-Dobbs -0.22215 0.02267
z value Pr(>|z|)
(Intercept) 47.70 < 2e-16 ***
EthnicityBlack, Non-Hispanic 82.72 < 2e-16 ***
EthnicityWhite, Non-Hispanic 92.91 < 2e-16 ***
EthnicityHispanic 58.36 < 2e-16 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic -7.76 8.5e-15 ***
EthnicityUnknown 48.82 < 2e-16 ***
Age_Group25-39 years 88.90 < 2e-16 ***
Age_Group40 years and over 1.89 0.0585 .
Age_GroupUnknown NA NA
Dobbs_EraPost-Dobbs -9.80 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Family: nbinom2 ( log )
Formula: Maternal_Deaths ~ Ethnicity + Age_Group + (1 | Year)
Data: deaths_df3
AIC BIC logLik -2*log(L) df.resid
4643.6 4686.1 -2311.8 4623.6 508
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
Year (Intercept) 0.03063 0.175
Number of obs: 518, groups: Year, 6
Dispersion parameter for nbinom2 family (): 109
Conditional model:
Estimate Std. Error
(Intercept) 3.42547 0.07539
EthnicityBlack, Non-Hispanic 2.16003 0.02752
EthnicityWhite, Non-Hispanic 2.40480 0.02731
EthnicityHispanic 1.56795 0.02826
EthnicityAmerican Indian or Alaska Native, Non-Hispanic -0.45485 0.06442
EthnicityUnknown 1.33507 0.02870
Age_Group25-39 years 1.59851 0.01995
Age_Group40 years and over 0.03698 0.02209
Age_GroupUnknown NA NA
z value Pr(>|z|)
(Intercept) 45.44 < 2e-16 ***
EthnicityBlack, Non-Hispanic 78.49 < 2e-16 ***
EthnicityWhite, Non-Hispanic 88.07 < 2e-16 ***
EthnicityHispanic 55.49 < 2e-16 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic -7.06 1.65e-12 ***
EthnicityUnknown 46.51 < 2e-16 ***
Age_Group25-39 years 80.12 < 2e-16 ***
Age_Group40 years and over 1.67 0.0942 .
Age_GroupUnknown NA NA
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model selection table
| Model | cnd((Int)) | dsp((Int)) | cnd(Age_Grp) | cnd(Dbb_Era) | cnd(Eth) | cnd(off(log(Liv_Brt))) | offset | df | logLik | AICc | delta | weight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| allno_glmmodel_nb | 3.516 | + | + | + | + | | | 11 | -2269.287 | 4561.1 | 0.00 | 0.969 |
| all_glmmodel_nb | -8.797 | + | + | + | + | + | log(Live_Births) | 11 | -2272.739 | 4568.0 | 6.90 | 0.031 |
| ethnicity_agegroupno_glmmodel_nb | 3.425 | + | + | | + | | | 10 | -2311.825 | 4644.1 | 82.99 | 0.000 |
| ethnicity_agegroup_glmmodel_nb | -8.887 | + | + | | + | + | log(Live_Births) | 10 | -2313.794 | 4648.0 | 86.93 | 0.000 |
Models ranked by AICc(x)
Random terms (all models): cond(1 | Year)
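The AICc, delta, and weight columns can be reproduced by hand from the log-likelihoods, the degrees of freedom, and the number of observations (518). The sketch below does this in Python for illustration; it assumes the standard small-sample correction \(AICc = AIC + \frac{2k(k+1)}{n-k-1}\) and the usual Akaike weight formula, which appear to match the reported values.

```python
import math

N_OBS = 518  # number of observations reported in each model summary

# (log-likelihood, degrees of freedom) copied from the model selection table.
models = {
    "allno_glmmodel_nb": (-2269.287, 11),
    "all_glmmodel_nb": (-2272.739, 11),
    "ethnicity_agegroupno_glmmodel_nb": (-2311.825, 10),
    "ethnicity_agegroup_glmmodel_nb": (-2313.794, 10),
}

def aicc(log_lik, k, n):
    """AIC with the small-sample correction term."""
    aic = -2.0 * log_lik + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

def akaike_weights(scores):
    """Relative support for each model given its AICc score."""
    best = min(scores.values())
    relative = {name: math.exp(-0.5 * (s - best)) for name, s in scores.items()}
    total = sum(relative.values())
    return {name: r / total for name, r in relative.items()}

scores = {name: aicc(ll, k, N_OBS) for name, (ll, k) in models.items()}
deltas = {name: s - min(scores.values()) for name, s in scores.items()}
weights = akaike_weights(scores)

for name in sorted(scores, key=scores.get):
    print(f"{name}: AICc = {scores[name]:.1f}, "
          f"delta = {deltas[name]:.2f}, weight = {weights[name]:.3f}")
```

Because the two Dobbs_Era models have the same number of parameters, their AICc difference is simply twice the difference in log-likelihood.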
Our chosen model in regression equation format:
\[ \begin{aligned} \log(\mathbb{E}[\text{Maternal Deaths}_i]) &= \beta_0 + \beta_1 \cdot \text{Black}_i \\ &+ \beta_2 \cdot \text{White}_i + \beta_3 \cdot \text{Hispanic}_i \\ &+ \beta_4 \cdot \text{American Indian or Alaska Native}_i \\ &+ \beta_5 \cdot \text{Unknown Ethnicity}_i \\ &+ \beta_6 \cdot \text{Age 25-39}_i + \beta_7 \cdot \text{Age 40 Plus}_i \\ &+ \beta_8 \cdot \text{Post Dobbs}_i + b_{\text{Year}[i]} + \log(\text{Live Births}_i) \end{aligned} \]
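To see what the offset term does, the sketch below plugs two of the fitted coefficients from the all_glmmodel_nb summary into this equation. It is written in Python for illustration and makes several simplifying assumptions: the ethnicity and age indicators are all zero (the reference levels), the year random effect is set to zero, and the helper name `expected_deaths` is hypothetical.

```python
import math

# Estimates on the log scale, copied from the all_glmmodel_nb summary.
INTERCEPT = -8.79687
POST_DOBBS = -0.22081

def expected_deaths(live_births, post_dobbs=False, year_effect=0.0):
    """Expected maternal deaths for the reference ethnicity and age group.

    The offset log(live_births) enters with a fixed coefficient of 1, so the
    expected count scales in direct proportion to the number of live births.
    """
    eta = INTERCEPT + year_effect + math.log(live_births)
    if post_dobbs:
        eta += POST_DOBBS
    return math.exp(eta)

# Expected deaths per 100,000 live births, before and after Dobbs.
rate_pre = expected_deaths(100_000)
rate_post = expected_deaths(100_000, post_dobbs=True)

print(f"pre-Dobbs:  {rate_pre:.2f} deaths per 100,000 live births")
print(f"post-Dobbs: {rate_post:.2f} deaths per 100,000 live births")
print(f"ratio:      {rate_post / rate_pre:.3f}")
```

Because of the offset, the exponentiated coefficients act on a rate (deaths per live birth) rather than on a raw count, which is why the intercept is so different between the models with and without the offset.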
| Maternal Deaths | | | | | |
|---|---|---|---|---|---|
| Predictors | Log-Mean | Std. Error | CI | Statistic | p |
| (Intercept) | -8.80 | 0.08 | -8.95 – -8.65 | -114.57 | <0.001 |
| Ethnicity [Black, Non-Hispanic] | 1.32 | 0.03 | 1.26 – 1.37 | 50.08 | <0.001 |
| Ethnicity [White, Non-Hispanic] | 0.28 | 0.03 | 0.23 – 0.33 | 10.88 | <0.001 |
| Ethnicity [Hispanic] | 0.16 | 0.03 | 0.11 – 0.22 | 5.99 | <0.001 |
| Ethnicity [American Indian or Alaska Native, Non-Hispanic] | 1.65 | 0.06 | 1.53 – 1.78 | 26.29 | <0.001 |
| Ethnicity [Unknown] | 0.04 | 0.03 | -0.01 – 0.10 | 1.59 | 0.113 |
| Age Group [25-39 years] | 0.39 | 0.02 | 0.35 – 0.42 | 21.31 | <0.001 |
| Age Group [40 years and over] | 1.81 | 0.02 | 1.77 – 1.85 | 88.17 | <0.001 |
| Dobbs Era [Post-Dobbs] | -0.22 | 0.02 | -0.27 – -0.18 | -9.59 | <0.001 |
| Random Effects | | | | | |
| σ² | 8.00 | | | | |
| τ₀₀ Year | 0.03 | | | | |
| ICC | 0.00 | | | | |
| N Year | 6 | | | | |
| Observations | 518 | | | | |
| Marginal R² / Conditional R² | 0.056 / 0.059 | | | | |
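The ICC of 0.00 in the table follows from how small the between-year variance is relative to the residual variance. Assuming the reported ICC is the usual ratio of the random-intercept variance to the total variance (an assumption about how the table was produced, not something stated in the output), the arithmetic is:

\[\text{ICC}=\frac{\tau_{00}}{\tau_{00}+\sigma^2}\approx\frac{0.03}{0.03+8.00}\approx 0.004\]

which rounds to 0.00 at two decimal places. Under this reading, very little of the unexplained variation is attributable to differences between years.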
| Maternal Deaths | | | |
|---|---|---|---|
| Predictors | Incidence Rate Ratios | CI | p |
| (Intercept) | 0.00 | 0.00 – 0.00 | <0.001 |
| Ethnicity [Black, Non-Hispanic] | 3.73 | 3.54 – 3.92 | <0.001 |
| Ethnicity [White, Non-Hispanic] | 1.33 | 1.26 – 1.40 | <0.001 |
| Ethnicity [Hispanic] | 1.18 | 1.12 – 1.24 | <0.001 |
| Ethnicity [American Indian or Alaska Native, Non-Hispanic] | 5.22 | 4.61 – 5.90 | <0.001 |
| Ethnicity [Unknown] | 1.04 | 0.99 – 1.10 | 0.113 |
| Age Group [25-39 years] | 1.47 | 1.42 – 1.53 | <0.001 |
| Age Group [40 years and over] | 6.11 | 5.87 – 6.36 | <0.001 |
| Dobbs Era [Post-Dobbs] | 0.80 | 0.77 – 0.84 | <0.001 |
| Random Effects | | | |
| σ² | 8.00 | | |
| τ₀₀ Year | 0.03 | | |
| ICC | 0.00 | | |
| N Year | 6 | | |
| Observations | 518 | | |
| Marginal R² / Conditional R² | 0.056 / 0.059 | | |
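The incidence rate ratios are the exponentiated log-scale coefficients, and the intervals can be recovered the same way. The sketch below, in Python for illustration, assumes Wald-type 95% intervals (estimate ± 1.96 standard errors on the log scale, then exponentiated), which appears to match the table; only three of the terms are shown.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval (Wald assumption)

# (estimate, standard error) on the log scale, from the all_glmmodel_nb summary.
coefficients = {
    "Ethnicity [Black, Non-Hispanic]": (1.31563, 0.02627),
    "Age Group [40 years and over]": (1.81031, 0.02053),
    "Dobbs Era [Post-Dobbs]": (-0.22081, 0.02303),
}

def rate_ratio(estimate, std_error, z=Z_95):
    """Return (IRR, lower, upper) by exponentiating the log-scale interval."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * std_error),
        math.exp(estimate + z * std_error),
    )

irr_table = {name: rate_ratio(est, se) for name, (est, se) in coefficients.items()}

for name, (irr, lower, upper) in irr_table.items():
    print(f"{name}: IRR = {irr:.2f} ({lower:.2f} to {upper:.2f})")
```

An IRR above 1 means a higher expected death rate than the reference level, and an IRR below 1 means a lower one; for example, 0.80 for Post-Dobbs corresponds to an estimated rate about 20% lower than pre-Dobbs, holding the other terms fixed.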
American Indian or Alaska Native women had the highest maternal mortality rate, despite having the smallest number of maternal deaths
Black women had the second highest maternal mortality rate, and the second highest total maternal deaths
Maternal deaths and the maternal mortality rate increased from 2019 until February 2022, and then decreased
Based on the provisional data, the maternal mortality rate did not seem to increase post-Dobbs
The data used is provisional and updates quarterly with both new and old counts, so further analysis may offer differing results
The data offered counts by age group and by ethnicity, but not by both together (e.g., maternal deaths for Black women 40 and over). The inclusion of such data would give a better indication of the relationship between the two factors.
The COVID-19 pandemic's impact on the healthcare system restricted access to regular healthcare. This likely had an impact on maternal mortality and may partially account for increased rates from 2020-2023.
A study utilizing state-specific data (due to differing laws regarding abortion) might give a clearer indication of the impact of Dobbs on maternal mortality.